Abstract: This paper introduces a particular game formulation and its
corresponding notion of equilibrium, namely the satisfaction form (SF) and the
satisfaction equilibrium (SE). A game in SF models the case where players are
exclusively interested in the satisfaction of some individual performance
constraints, instead of individual performance optimization. Under this
formulation, the notion of equilibrium corresponds to the situation where all
players can simultaneously satisfy their individual constraints. The notion of
SE models the problem of QoS provisioning in decentralized self-configuring
networks. Here, radio devices are satisfied if they are able to provide the
requested QoS. Within this framework, the concept of SE is formalized for both
pure and mixed strategies considering finite sets of players and actions. In
both cases, sufficient conditions for the existence and uniqueness of the SE
are presented. When multiple SE exist, we introduce the idea of effort or cost
of satisfaction and we propose a refinement of the SE, namely the efficient SE
(ESE). At the ESE, all players adopt the action which requires the lowest
effort for satisfaction. A learning method that allows radio devices to
achieve a SE in pure strategies in finite time and requiring only one-bit
feedback is also presented. Finally, a power control game in the interference
channel is used to highlight the advantages of modeling QoS problems following
the notion of SE rather than other equilibrium concepts, e.g., generalized
Nash equilibrium.



I. Introduction 

Nowadays, in the context of decentralized networks, a well accepted idea among
radio device manufacturers, network designers and service providers is to
consider radio devices as smart entities able to autonomously adopt the
transmit/receive configuration that allows the achievement of a certain level
of individual quality-of-service (QoS). In particular, a transmit/receive
configuration can be described in terms of power allocation policies,
coding-modulation schemes, scheduling policies, decoding order, etc. Within
this framework, the resulting decentralized radio resource sharing problem can
be analyzed using tools from both non-cooperative game theory (GT) [1], [2]
and variational inequality (VI) theory [3], [4]. Indeed, strong connections
between both approaches have already been highlighted [5], [6]. From this
point of view, the competitive interaction through mutual interference between
all radio devices can be modeled by a game in normal form with coupled action
sets (see [2] and references therein). Here, radio devices are players aiming
to optimize an individual benefit (performance metric) and guarantee certain
individual constraints (QoS) by selfishly choosing their actions, i.e., the
transmit/receive configurations. In this context, the above-mentioned coupling
between the action sets stems from the fact that the set of transmit/receive
configurations that satisfy the QoS of a given radio device depends on the
configurations adopted by all the other devices. Under this game formulation,
initially proposed in [7], [8], the notion of Nash equilibrium (NE) [9] is
transformed into the notion described in [7], which is known nowadays as
generalized Nash equilibrium (GNE). The notion of GNE is already well
integrated in the context of QoS provisioning in decentralized networks. For
instance, it is used in [10] and [11] to model a network where radio devices
aim to minimize the transmit power consumption while guaranteeing some
individual minimum signal to interference plus noise ratios (SINR). Therein,
the existence and uniqueness of the GNE are discussed. In particular, both
articles provide sufficient conditions for the convergence of the best
response dynamics [12] to the GNE, in the context of finite action sets (only
in [11]) and convex and closed action sets (in [10] and [11]). Other examples
are provided in [13] and references therein.

Interestingly, in [5], [6], it has been shown that a GNE can be formulated as
a VI problem as long as the set of actions is closed and convex and the
utility functions are continuously differentiable. From such a formulation,
many conclusions regarding the existence and uniqueness of the equilibrium can
be obtained based on existing results in VI theory [3]. Additionally,
algorithms to achieve the GNE can be developed based on existing methods for
solving the VI problem; for instance, see [14], [15]. Nonetheless, the use of
VI in the context of decentralized networks is limited since it requires the
explicit expression of the metric projection onto the set of actions [16],
[17], [18]. This condition is particularly difficult to satisfy in
decentralized networks since radio devices possess only local information
[19]. An extensive analysis of VI in the context of decentralized wireless
communications is presented in [3].
In this paper, we present a game formulation and its corresponding notion of
equilibrium, namely the satisfaction form (SF) and the satisfaction
equilibrium (SE). The satisfaction form models the case where players are
concerned with the satisfaction of their individual constraints but not with
the optimization of their performance metrics. A game is said to be in SE when
all players simultaneously satisfy their individual constraints. In the
sequel, we exclusively focus on the case of finite sets, since we aim to model
the fact that radio devices choose from a finite number of transmit
configurations.
The idea behind SE was originally introduced in [20], [21] for a particular
class of conditions in pure strategies for the case of finite sets of actions.
Later, the concept was formulated in terms of a fixed point inclusion for the
case of pure strategies in [22], in the context of both finite action sets and
convex and closed action sets. The advantages of the notion of SE over
classical notions such as GNE, at least in the domain of signal processing for
wireless communications, are manifold. Here, we highlight the fact that (i)
the existence of the SE in pure strategies is less restrictive than the
existence of the GNE, that is, a network can possess a SE but not a GNE; and
(ii) the behavioral rules to learn a SE are simpler than the behavioral rules
to learn a GNE [3], [4]. Within this framework, the contributions presented in
this paper are the following:

• The notions of SF and SE are formalized in both pure strategies (PS) and
  mixed strategies (MS) in the context of finite games. Conditions for the
  existence of the SE in PS and MS are established.

• We introduce the notion of epsilon-satisfaction equilibrium ($\epsilon$-SE),
  which consists of a mixed strategy which allows all players to be satisfied
  with probability not less than a certain threshold. This equilibrium concept
  turns out to be less restrictive in terms of existence than the SE.

• A refinement of the notion of SE, which we refer to as efficient SE (ESE),
  is presented as a mechanism for equilibrium selection involving the idea of
  effort for satisfaction.

• A simple learning algorithm to achieve SE, based on the algorithms proposed
  in [21], is presented.
The remainder of the paper is organized as follows. In Sec. II, the QoS
provisioning problem in decentralized self-configuring networks is formulated.
In Sec. III, the notions of SF and SE are presented. In Sec. IV, the existence
and uniqueness of the SE are analyzed. In Sec. V, we compare the notions of SF
and SE with existing equilibrium notions. In Sec. VI, behavioral rules that
allow radio devices to learn a SE are described. In Sec. VII, the notion of SE
is used in the context of a simple power control game in the interference
channel where transmitters must guarantee a minimum transmission rate. The
paper is concluded by Sec. VIII.

II. Problem Formulation 

We consider a fully decentralized network where 
transmitters communicate with their respective receivers 
by sharing a common set of radio resources, and thus, 
subject to a competitive interaction. For instance, the 
usage of the same frequency bands brings a mutual 
interference condition. Within this framework, each radio 
device, either a transmitter or receiver, aims to au- 
tonomously adjust its transmit/receive configuration in 
order to guarantee a communication with certain QoS 
level. The QoS can be described by several parameters 
depending on the type of services. Some classical ex- 
amples are constraints over the transmission rates and 
maximum delays. The key point is that the feasibil- 
ity of the QoS of each radio device depends on the 
configurations adopted by all the other devices in the 
network. Regarding the transmit/receive configurations, 
we assume that there exists a finite number of feasible 
choices for each radio device. In particular, a configura- 
tion can be described in terms of channel selection and 
power allocation policy, modulation and error correction 
schemes, constellation sizes, etc. Similarly, receivers 
might tune their scheduling, decoding order, etc. 
In this work, we consider that radio devices are selfish entities aiming to
satisfy their individual QoS constraints. In particular, each device is
indifferent to whether or not other devices achieve their required QoS.
Moreover, message exchange between radio devices for establishing some form of
coordination to jointly improve the individual or global performance is not
considered here. This is mainly because of the amount of signaling it might
require and also because devices are not necessarily able to communicate due
to the use of different physical layer technologies.
In the sequel of this article, we intend to provide a mathematical framework
for tackling this problem using tools from game theory. Particular attention
is given to real-system implementation constraints, for instance a finite
number of choices and a limited number of bits dedicated to feedback.

III. Games in Satisfaction Form and 
Satisfaction Equilibrium 

In this section, we introduce a novel game formulation 
where in contrast to existing formulations (e.g., normal 
form 13 and normal form with constrained action sets 
Q), the idea of performance optimization, i.e., utility 
maximization or cost minimization, does not exist. In our
formulation, which we refer to as the satisfaction form,
the aim of the players is to adopt any of the actions which
allow them to satisfy a specific condition given the
actions adopted by all the other players. Under this game
formulation, we introduce the concept of satisfaction
equilibrium.

A. Games in Satisfaction Form 

In general, a game in satisfaction form can be described by the following
triplet

$$\mathcal{G} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}}\right). \tag{1}$$

Here, the set $\mathcal{K} = \{1, \ldots, K\}$ represents the set of players
and the set $\mathcal{A}_k = \{A_k^{(1)}, \ldots, A_k^{(N_k)}\}$ represents
the set of actions available for transmitter $k$. An action profile is a
vector $a = (a_1, \ldots, a_K) \in \mathcal{A}$, where

$$\mathcal{A} = \mathcal{A}_1 \times \ldots \times \mathcal{A}_K. \tag{2}$$

In this analysis, the sets $\mathcal{K}$ and
$\{\mathcal{A}_k\}_{k \in \mathcal{K}}$ are assumed finite and non-empty. We
denote by $a_{-k} = (a_1, \ldots, a_{k-1}, a_{k+1}, \ldots, a_K) \in \mathcal{A}_{-k}$,
where

$$\mathcal{A}_{-k} = \mathcal{A}_1 \times \ldots \times \mathcal{A}_{k-1} \times \mathcal{A}_{k+1} \times \ldots \times \mathcal{A}_K, \tag{3}$$

the vector obtained by dropping the $k$-th component of the vector $a$. With a
slight abuse of notation, we write the vector $a$ as $(a_k, a_{-k})$, in order
to emphasize its $k$-th component. The correspondence
$f_k : \mathcal{A}_{-k} \to 2^{\mathcal{A}_k}$ determines the set of actions
of player $k$ which allow its satisfaction given the actions played by all the
other players. Here, the notation $2^{\mathcal{A}_k}$ refers to the set of all
possible subsets of the set $\mathcal{A}_k$, including $\mathcal{A}_k$ itself.
Note that $2^{\mathcal{A}_k}$ also includes the empty set, which models the
case when one player ends up without an action that allows the satisfaction of
its individual constraints given the other players' actions. Often, this is a
strong mathematical constraint and thus, in some sections of this paper, we
assume that none of the correspondences $f_k$ maps to the empty set.

In general, an important outcome of a game in satisfaction form is the one
where all players are satisfied. We refer to this game outcome as satisfaction
equilibrium (SE).

Definition 1 (Satisfaction Equilibrium in PS [22]): An action profile $a^+$ is
an equilibrium for the game
$\mathcal{G} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}}\right)$ if

$$\forall k \in \mathcal{K}, \quad a_k^+ \in f_k\left(a_{-k}^+\right). \tag{4}$$

Note that under this formulation, the outcome where all players are satisfied
is naturally an equilibrium. Here, since the aim of each player is to be
satisfied, none of them has a particular interest in changing its current
action once they are at the SE. An important remark here is that players are
assumed to be indifferent to whether or not other players can satisfy their
individual constraints. An interesting analysis of the impact of this
assumption on the definition of equilibrium can be found in [23].
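Since the sets of players and actions are finite, condition (4) can be checked by exhaustive enumeration. The following is a minimal sketch under stated assumptions: the function name `satisfaction_equilibria`, the representation of each correspondence $f_k$ as a Python callable, and the two-player toy game are all illustrative choices and do not come from the paper.

```python
from itertools import product

def satisfaction_equilibria(action_sets, correspondences):
    """Enumerate all pure-strategy SE of a finite game in satisfaction form.

    action_sets[k] is the action set of player k. correspondences[k] is a
    callable taking the tuple a_{-k} (player k's entry removed) and returning
    the set f_k(a_{-k}) of actions that satisfy player k.
    """
    equilibria = []
    for profile in product(*action_sets):
        # Condition (4): every player's action lies in f_k(a_{-k}).
        if all(
            profile[k] in correspondences[k](profile[:k] + profile[k + 1:])
            for k in range(len(action_sets))
        ):
            equilibria.append(profile)
    return equilibria

# Hypothetical toy game: two devices choose between channels 0 and 1, and
# each one is satisfied only if the other device uses the other channel.
actions = [[0, 1], [0, 1]]
f = [
    lambda others: {a for a in (0, 1) if a != others[0]},
    lambda others: {a for a in (0, 1) if a != others[0]},
]
toy_se = satisfaction_equilibria(actions, f)
print(toy_se)
```

The enumeration visits $\prod_k N_k$ profiles, so it is only meant as an analysis aid for small games, not as a procedure that devices could run in a decentralized network.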

In this context, when decentralized self-configuring networks (DCSN) are
modeled using the satisfaction form, radio devices are indifferent to the fact
that there might exist another transmit configuration with which a higher
performance, e.g., transmission rate, can be obtained. Here, as long as each
radio device is able to satisfy its individual QoS conditions, it has no
incentive to unilaterally change its current transmit/receive configuration.

In the following example, we show how a given decentralized self-configuring
network can be modeled by a game in satisfaction form.

Example 1: Consider a decentralized and self-configuring network made of a set
$\mathcal{K} = \{1, \ldots, K\}$ of transmitter-receiver pairs. For all
$k \in \mathcal{K}$, let $\mathcal{A}_k$ be the set of transmit configurations
available for transmitter $k$ and let the function
$u_k : \mathcal{A}_1 \times \ldots \times \mathcal{A}_K \to \mathbb{R}_+$
denote its (Shannon) transmission rate. Transmitter $k$ must guarantee a data
rate of at least $\Gamma_k$ bps. Hence, the set of configurations it must
adopt, given the configurations $a_{-k}$ of all the other transmitters, is
determined by the correspondence
$f_k : \mathcal{A}_{-k} \to 2^{\mathcal{A}_k}$, which we define as follows,

$$f_k(a_{-k}) = \left\{a_k \in \mathcal{A}_k : u_k(a_k, a_{-k}) \geq \Gamma_k\right\}. \tag{5}$$

Thus, the behavior of this network can be modeled by the game
$\mathcal{G} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}}\right)$
in satisfaction form.
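As an illustration of the correspondence in (5), the sketch below instantiates Example 1 for two transmitter-receiver pairs that choose among a few discrete power levels. All numerical values (channel gains `g`, noise power, power levels, thresholds `gamma`) are hypothetical, and the rate is expressed in bits/s/Hz for a unit bandwidth, which is an assumption made here for simplicity.

```python
import math
from itertools import product

# Hypothetical parameters (not taken from the paper).
g = [[1.0, 0.2], [0.2, 1.0]]      # g[j][k]: gain from transmitter j to receiver k
noise = 1.0                       # noise power at each receiver
power_levels = [0.0, 1.0, 4.0]    # finite action set A_k (transmit powers)
gamma = [1.0, 1.0]                # rate thresholds Gamma_k

def rate(k, a):
    """Shannon rate u_k(a) of pair k under the power profile a."""
    interference = sum(g[j][k] * a[j] for j in range(len(a)) if j != k)
    return math.log2(1.0 + g[k][k] * a[k] / (noise + interference))

def f(k, a_minus_k):
    """Correspondence (5): power levels of player k that meet its threshold."""
    satisfied = set()
    for p in power_levels:
        a = a_minus_k[:k] + (p,) + a_minus_k[k:]   # rebuild the full profile
        if rate(k, a) >= gamma[k]:
            satisfied.add(p)
    return satisfied

# Pure-strategy SE: profiles where every player's power lies in f_k(a_{-k}).
se = [
    a for a in product(power_levels, repeat=2)
    if all(a[k] in f(k, a[:k] + a[k + 1:]) for k in range(2))
]
print(se)
```

With these particular numbers, both transmitters need the highest power level to meet their thresholds simultaneously, which shows how the coupling through interference shapes the sets $f_k(a_{-k})$.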

In Sec. V, we use this example to highlight the differences between the
satisfaction form and other game formulations. In the following, we describe
the extension in mixed strategies of the game in satisfaction form.
B. Extension in Mixed Strategies of the Satisfaction Form

The concept of mixed strategies was introduced by Borel in [24]. A mixed
strategy of player $k$ is a probability distribution over the set of actions
$\mathcal{A}_k$. We denote the set of all possible probability distributions
over the set $\mathcal{A}_k$ by $\Delta(\mathcal{A}_k)$, i.e., the unit
simplex over the elements of $\mathcal{A}_k$. We denote by
$\pi_k = \left(\pi_{k, A_k^{(1)}}, \ldots, \pi_{k, A_k^{(N_k)}}\right)$ the
probability distribution (mixed strategy) chosen by player $k$. Here, for all
$n_k \in \{1, \ldots, N_k\}$, $\pi_{k, A_k^{(n_k)}}$ represents the
probability that player $k$ plays action $A_k^{(n_k)} \in \mathcal{A}_k$.

Following this notation, we denote by
$\mathcal{G}' = \left(\mathcal{K}, \{\Delta(\mathcal{A}_k)\}_{k \in \mathcal{K}}, \{\bar{f}_k\}_{k \in \mathcal{K}}\right)$
the extension in mixed strategies of the game
$\mathcal{G} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}}\right)$,
where the correspondence

$$\bar{f}_k : \prod_{j \in \mathcal{K} \setminus \{k\}} \Delta(\mathcal{A}_j) \to 2^{\Delta(\mathcal{A}_k)} \tag{6}$$

determines the set of all possible probability distributions that allow player
$k$ to always choose an action which satisfies its individual conditions, that
is,

$$\bar{f}_k(\pi_{-k}) = \left\{\pi_k \in \Delta(\mathcal{A}_k) : \Pr\left(a_k \in f_k(a_{-k}) \mid (\pi_k, \pi_{-k})\right) = 1\right\}. \tag{7}$$

In this context, we define the SE as follows.

Definition 2 (Satisfaction Equilibrium in MS): The mixed strategy profile
$\pi^* \in \Delta(\mathcal{A}_1) \times \ldots \times \Delta(\mathcal{A}_K)$
is a SE of the game
$\mathcal{G}' = \left(\mathcal{K}, \{\Delta(\mathcal{A}_k)\}_{k \in \mathcal{K}}, \{\bar{f}_k\}_{k \in \mathcal{K}}\right)$
if for all $k \in \mathcal{K}$,

$$\pi_k^* \in \bar{f}_k\left(\pi_{-k}^*\right). \tag{8}$$

From Def. 2 and (7), it follows that if
$\pi^* \in \Delta(\mathcal{A}_1) \times \ldots \times \Delta(\mathcal{A}_K)$
is a SE, then the following holds, for all $k \in \mathcal{K}$,

$$\Pr\left(a_k \in f_k(a_{-k}) \mid (\pi_k^*, \pi_{-k}^*)\right) = 1. \tag{9}$$

Note that the set of equilibria of the game $\mathcal{G}$ is a subset of the
set of equilibria of the mixed extension $\mathcal{G}'$, if we establish an
injection from the action set $\mathcal{A}_k$ to the canonical basis of the
space of $N_k$-dimensional vectors $\mathbb{R}^{N_k}$. For instance, let the
$n_k$-th action of player $k$, i.e., $A_k^{(n_k)}$, be associated with the
unit vector
$e_{n_k}^{(N_k)} = \left(e_{n_k,1}^{(N_k)}, \ldots, e_{n_k,N_k}^{(N_k)}\right) \in \mathbb{R}^{N_k}$,
where all the components of the vector $e_{n_k}^{(N_k)}$ are zero except its
$n_k$-th component, which is one. The vector $e_{n_k}^{(N_k)}$ represents a
degenerate probability distribution, where the action $A_k^{(n_k)}$ is
deterministically chosen. Using this argument, it becomes clear that every SE
of the game $\mathcal{G}$ is also a SE in the game $\mathcal{G}'$.
As we shall see in the next section, games in satisfaction form might not
necessarily have a SE either in pure strategies or in mixed strategies. Thus,
in the following we present a less restrictive notion of equilibrium, which we
refer to as epsilon-satisfaction equilibrium ($\epsilon$-SE).

C. Epsilon-Satisfaction Equilibrium 

At the SE of the game
$\mathcal{G}' = \left(\mathcal{K}, \{\Delta(\mathcal{A}_k)\}_{k \in \mathcal{K}}, \{\bar{f}_k\}_{k \in \mathcal{K}}\right)$,
players choose their actions following a probability distribution such that
only action profiles that allow all players to simultaneously satisfy their
individual conditions are played with positive probability. This
interpretation leads immediately to the conclusion that if there does not
exist at least one action profile that allows all players to be simultaneously
satisfied, then there does not exist any SE in the game $\mathcal{G}'$.
However, under certain conditions, it is always possible to build mixed
strategies that allow players to be satisfied with a probability which is
close to 1, i.e., $1 - \epsilon$, for a sufficiently small $\epsilon > 0$.

Definition 3 (Epsilon-Satisfaction Equilibrium): Let $\epsilon \in \,]0, 1]$.
The mixed strategy profile
$\pi^* \in \Delta(\mathcal{A}_1) \times \ldots \times \Delta(\mathcal{A}_K)$
is an epsilon-satisfaction equilibrium ($\epsilon$-SE) of the game
$\mathcal{G}'$ if for all $k \in \mathcal{K}$, it follows that

$$\pi_k^* \in \bar{\bar{f}}_k\left(\pi_{-k}^*\right), \tag{10}$$

where

$$\bar{\bar{f}}_k\left(\pi_{-k}^*\right) = \left\{\pi_k \in \Delta(\mathcal{A}_k) : \Pr\left(a_k \in f_k(a_{-k}) \mid (\pi_k, \pi_{-k}^*)\right) \geq 1 - \epsilon\right\}. \tag{11}$$

From Def. 3, it follows that if the mixed strategy profile $\pi^*$ is an
$\epsilon$-SE, it holds that, for all $k \in \mathcal{K}$,

$$\Pr\left(a_k \in f_k(a_{-k}) \mid (\pi_k^*, \pi_{-k}^*)\right) \geq 1 - \epsilon. \tag{12}$$

That is, each player is unsatisfied with probability at most $\epsilon$. The
relevance of the $\epsilon$-SE is that it models the fact that players can
tolerate a certain level of non-satisfaction. At a given $\epsilon$-SE, none
of the players is interested in changing its mixed strategy as long as it is
satisfied with a probability not lower than the threshold $1 - \epsilon$. As
we shall see, a game might not possess a SE either in pure or in mixed
strategies, but it might possess an $\epsilon$-SE.
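The probability in (12) can be evaluated directly for a finite game when players randomize independently. The sketch below is a hedged illustration: the helper names `satisfaction_probabilities` and `smallest_epsilon`, the dictionary representation of mixed strategies, and the two-player toy game (which has no SE in pure strategies) are assumptions introduced here, not elements of the paper.

```python
from itertools import product

def satisfaction_probabilities(action_sets, correspondences, strategies):
    """Return the list [Pr(a_k in f_k(a_{-k}) | pi)] over players k.

    strategies[k] maps each action of player k to its probability. Players
    randomize independently, so a profile's probability is the product of
    the individual probabilities.
    """
    K = len(action_sets)
    probs = [0.0] * K
    for profile in product(*action_sets):
        weight = 1.0
        for k in range(K):
            weight *= strategies[k][profile[k]]
        for k in range(K):
            others = profile[:k] + profile[k + 1:]
            if profile[k] in correspondences[k](others):
                probs[k] += weight
    return probs

def smallest_epsilon(probs):
    """Smallest eps such that every player is satisfied w.p. >= 1 - eps."""
    return 1.0 - min(probs)

# Hypothetical toy game without pure SE: player 0 is satisfied when both
# actions match, player 1 is satisfied when they differ.
actions = [[0, 1], [0, 1]]
f = [lambda others: {others[0]}, lambda others: {1 - others[0]}]
uniform = {0: 0.5, 1: 0.5}
probs = satisfaction_probabilities(actions, f, [uniform, uniform])
eps = smallest_epsilon(probs)
print(probs, eps)
```

In this toy game no action profile satisfies both players, so no SE exists, yet the uniform profile satisfies each player half of the time and is therefore an $\epsilon$-SE for $\epsilon = 0.5$ in the sense of Def. 3.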
A thorough analysis of the existence and uniqueness of the SE in pure
strategies and mixed strategies is presented in the next section. Similarly,
the conditions for the existence of an $\epsilon$-SE are also discussed.
IV. Existence and Uniqueness of the
Satisfaction Equilibrium

In this section, we study the existence and uniqueness of a satisfaction
equilibrium in games in satisfaction form and in their corresponding extension
in mixed strategies. Particular attention is given to the existence of an
$\epsilon$-SE in the case where there does not exist any SE, either in pure or
in mixed strategies.

A. Existence of SE in Pure Strategies 

In order to study the existence of a SE in the game
$\mathcal{G} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}}\right)$,
let the correspondence $F : \mathcal{A} \to 2^{\mathcal{A}}$ be defined as
follows:

$$F(a) = f_1(a_{-1}) \times \ldots \times f_K(a_{-K}). \tag{13}$$

Then, a SE exists if and only if

$$\exists a \in \mathcal{A} : \quad a \in F(a). \tag{14}$$

This formulation allows us to use existing fixed point (FP) theorems to
provide sufficient conditions for the existence of the SE. For instance, one
can rely on the fixed point theorem of Knaster and Tarski [25] to state the
following theorem.

Theorem 4 (Existence of SE in finite games): Consider the game
$\mathcal{G} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}}\right)$
and let the set $\mathcal{A}$ have a binary relation denoted by $\preceq$. Let
also
(i) $V = \langle \mathcal{A}, \preceq \rangle$ be a complete lattice;
(ii) $F(a)$ be non-empty for all $a \in \mathcal{A}$;
(iii) the correspondence $F$ in (13) satisfy that
$\forall (a, a') \in \mathcal{A}^2$ such that $a \preceq a'$, it holds that

$$\forall (b, b') \in F(a) \times F(a'), \quad b \preceq b'. \tag{15}$$

Then the game has at least one SE in pure strategies.
Note that Theorem 4 requires that for all $a \in \mathcal{A}$, the set $F(a)$
is non-empty, i.e.,

$$\forall k \in \mathcal{K} \text{ and } \forall a_{-k} \in \mathcal{A}_{-k}, \quad \exists a_k \in \mathcal{A}_k : a_k \in f_k(a_{-k}). \tag{16}$$

In some cases, this condition might appear restrictive. However, in the
general context of wireless communications, when a radio device cannot provide
its QoS, the default action is simply going into standby. This suggests the
existence of an action "do nothing" (DN), which can be used to avoid the
emptiness of $f_k(a_{-k})$ whenever it is required. Modeling the existence of
the DN action strongly depends on the scenario. For instance, in the case of
power allocation games, such an action can be the null vector, that is, zero
transmit power.
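The role of the DN action in condition (16) can be illustrated numerically. The sketch below is only one possible way to model it: the label `DN`, the wrapper `with_do_nothing` (which returns $\{\mathrm{DN}\}$ exactly when the original correspondence is empty) and the toy game are assumptions made for illustration.

```python
from itertools import product

DN = "DN"  # hypothetical label for the "do nothing" (standby) action

def is_nonempty_everywhere(action_sets, correspondences):
    """Check condition (16): f_k(a_{-k}) is non-empty for all k and a_{-k}."""
    for k in range(len(action_sets)):
        other_sets = action_sets[:k] + action_sets[k + 1:]
        for others in product(*other_sets):
            if not correspondences[k](others):
                return False
    return True

def with_do_nothing(f_k):
    """Return a correspondence that falls back to {DN} when f_k is empty."""
    def wrapped(others):
        satisfied = f_k(others)
        return satisfied if satisfied else {DN}
    return wrapped

# Hypothetical toy game: a device is satisfied by transmitting at "high"
# power, but only if the other device is not also transmitting at "high".
def f_raw(others):
    return {"high"} if others[0] != "high" else set()

actions = [["low", "high"], ["low", "high"]]
before = is_nonempty_everywhere(actions, [f_raw, f_raw])

actions_dn = [acts + [DN] for acts in actions]
f_dn = with_do_nothing(f_raw)
after = is_nonempty_everywhere(actions_dn, [f_dn, f_dn])
print(before, after)
```

Under this modeling choice, hypothesis (ii) of Theorem 4 holds by construction; the remaining hypotheses still have to be verified separately for the order relation under consideration.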
In the following, we study the existence of an equilibrium in mixed strategies
and of an $\epsilon$-SE, which appear to be less restrictive.



B. Existence of the SE in Mixed Strategies 

As in the case of pure strategies, the condition for the existence of a SE in
the mixed extension
$\mathcal{G}' = \left(\mathcal{K}, \{\Delta(\mathcal{A}_k)\}_{k \in \mathcal{K}}, \{\bar{f}_k\}_{k \in \mathcal{K}}\right)$
reduces to the study of a fixed point inclusion. Let the correspondence
$\bar{F} : \Delta(\mathcal{A}_1) \times \ldots \times \Delta(\mathcal{A}_K) \to 2^{\Delta(\mathcal{A}_1) \times \ldots \times \Delta(\mathcal{A}_K)}$
be defined as follows:

$$\bar{F}(\pi) = \bar{f}_1(\pi_{-1}) \times \ldots \times \bar{f}_K(\pi_{-K}). \tag{17}$$

Then, a SE exists if and only if

$$\exists \pi \in \Delta(\mathcal{A}_1) \times \ldots \times \Delta(\mathcal{A}_K) : \quad \pi \in \bar{F}(\pi). \tag{18}$$

Thus, all the results of fixed point theory [26], in the case of compact and
convex sets, are valid for the study of the existence of the SE in the game
$\mathcal{G}'$. Nonetheless, some results are immediate from Def. 2. For
instance, note that if a game in satisfaction form does not have a SE in pure
strategies, then it does not have a SE in mixed strategies either. This is
due to the fact that players mix only the actions that guarantee their
satisfaction with probability one. That is, player $k$ mixes a subset of its
actions $\mathcal{A}'_k \subseteq \mathcal{A}_k$, i.e.,
$\forall a_k \in \mathcal{A}'_k$, $\pi_{k, a_k} > 0$, only if the following
condition holds, $\forall a_k \in \mathcal{A}'_k$,

$$\Pr\left(a_k \in f_k(a_{-k}) \mid \pi_{-k}\right) = 1. \tag{19}$$

This implies that player $k$ assigns a strictly positive probability to more
than one action, i.e., it plays strictly mixed strategies, only if such a set
of actions guarantees its satisfaction for all the action profiles
$a_{-k} \in \mathcal{A}_{-k}$ which are played with non-zero probability. As a
consequence, there might exist several SE in pure strategies but no SE in
strictly mixed strategies in the game $\mathcal{G}'$.

C. Existence of the Epsilon-Satisfaction Equilibrium 

As shown in the previous subsection, the conditions for the existence of at
least one SE in the extension in mixed strategies of the game
$\mathcal{G} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}}\right)$
remain very strict. Indeed, the game has a SE in mixed strategies if and only
if it has a SE in pure strategies. On the contrary, the existence of at least
one $\epsilon$-SE is less strict and does not require the existence of a pure
SE. A sufficient and necessary condition for the existence of at least one
$\epsilon$-SE is the following.

Proposition 5: Let
$\mathcal{G} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}}\right)$
be a finite game in satisfaction form. Then, if the following condition holds,

$$\forall k \in \mathcal{K}, \ \exists a \in \mathcal{A} : \quad a_k \in f_k(a_{-k}), \tag{20}$$

there always exists a strategy profile
$\pi^* \in \Delta(\mathcal{A}_1) \times \ldots \times \Delta(\mathcal{A}_K)$
and an $\epsilon$, with $1 > \epsilon > 0$, such that $\pi^*$ is an
$\epsilon$-SE.

Proof: Assume that the condition (20) holds. Then, for all
$j \in \mathcal{K}$, it holds that the set

$$\mathcal{A}_j^* = \left\{a \in \mathcal{A} : a_j \in f_j(a_{-j})\right\} \tag{21}$$

is non-empty. Denote by $a_j^* = \left(a_{j,1}^*, \ldots, a_{j,K}^*\right)$ a
particular element of the set $\mathcal{A}_j^*$. Any mixed strategy
$\pi^+ \in \Delta(\mathcal{A}_1) \times \ldots \times \Delta(\mathcal{A}_K)$
such that, $\forall (j, k) \in \mathcal{K}^2$,

$$\pi^+_{k, a_{j,k}^*} > 0, \tag{22}$$

guarantees that, $\forall j \in \mathcal{K}$, the action profile $a_j^*$ is
played with non-zero probability. Thus, $\forall k \in \mathcal{K}$,

$$\Pr\left(a_k \in f_k(a_{-k}) \mid \pi^+\right) = \epsilon_k, \tag{23}$$

where $1 \geq \epsilon_k \geq \prod_{j=1}^{K} \pi^+_{j, a_{k,j}^*} > 0$, which
proves the existence of a mixed strategy profile such that,
$\forall k \in \mathcal{K}$,

$$\Pr\left(a_k \in f_k(a_{-k}) \mid \pi^+\right) \geq 1 - \epsilon, \tag{24}$$

where $\epsilon = 1 - \min_{k \in \mathcal{K}} \epsilon_k$, which completes
the proof. ■

Note that for all $k \in \mathcal{K}$, the condition (20) only requires the
existence of at least one action profile where player $k$ is satisfied, which
is less restrictive than conditions (14) and (18). Note also that, as long as
(20) holds, a simple uniform distribution over each individual set of actions
$\mathcal{A}_k$ is an $\epsilon$-SE. Indeed, under the assumption of a uniform
probability distribution, every action profile is played with probability
$\prod_{j=1}^{K} \frac{1}{N_j}$, and thus the uniform profile is an
$\epsilon$-SE for any $\epsilon$ satisfying

$$\epsilon \geq 1 - \prod_{j=1}^{K} \frac{1}{N_j}.$$

Equality is attained, i.e., no smaller $\epsilon$ is valid, in a game with no
SE in pure strategies in which, $\forall k \in \mathcal{K}$, there exists a
unique action profile $a^{(k)}$ such that
$a_k^{(k)} \in f_k\left(a_{-k}^{(k)}\right)$, and
$\forall (j, k) \in \mathcal{K}^2$ with $j \neq k$, $a^{(j)} \neq a^{(k)}$.

D. Multiplicity of the SE and Equilibrium Selection

In general, it is difficult to provide the conditions to observe a unique SE
for a general set of correspondences $\{f_k\}_{k \in \mathcal{K}}$. As we
shall see in Sec. VII, the SE is often not unique in games modeling
decentralized self-configuring wireless networks, and thus, an equilibrium
selection process might be required.
We start our analysis by pointing out that using a higher transmit power level
or a more complex modulation scheme (e.g., in the sense of the size of the
constellation) might require a higher energy consumption and thus reduce the
battery lifetime of the transmitters. In this scenario, one might infer that
radio devices are interested in satisfying their required QoS with the lowest
effort. Here, we can express the effort, for instance, in terms of energy
consumption or signal processing complexity.

In the following, our interest is to select an action profile which allows all
the players to be satisfied with the lowest effort. We refer to this action
profile as efficient satisfaction equilibrium (ESE).

The game where each player aims to satisfy its QoS with the minimum effort can
be formulated as a game in normal form with constrained sets of actions,

$$\hat{\mathcal{G}} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{c_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}}\right). \tag{25}$$

Here, for all $k \in \mathcal{K}$, the cost function
$c_k : \mathcal{A}_k \to [0, 1]$ satisfies the following condition:
$\forall (a_k, a'_k) \in \mathcal{A}_k^2$, it holds that

$$c_k(a_k) < c_k(a'_k) \tag{26}$$

if and only if $a_k$ requires a lower effort than action $a'_k$ when it is
played by player $k$. For instance, if transmitters are choosing the transmit
power level, one can consider the cost function $c_k$ as the identity
function, that is, the transmitters are concerned with the amount of power
they use in their transmissions.

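Following this reasoning, the set of ESE of the game
$\mathcal{G} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}}\right)$,
with respect to the cost functions $c_k$, corresponds to the set of GNE of the
game $\hat{\mathcal{G}}$ in (25).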
It is important to note that in the game
$\hat{\mathcal{G}} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{c_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}}\right)$,
the competitive interaction between all players is not modeled by the cost
functions $\{c_k\}_{k \in \mathcal{K}}$. Indeed, the cost function of player
$k$, $c_k$, depends only on its chosen action $a_k$. In this game formulation
$\hat{\mathcal{G}}$, the interaction between players is modeled by the
correspondence $f_k$, which is defined over the set of action profiles
$\mathcal{A}_{-k}$.
An important remark here is that if all players assign the same cost (or
effort) to all their actions, then the set of ESE and the set of SE are
identical. This implies that the interest of the formulation
$\hat{\mathcal{G}}$ is precisely that players can differentiate the effort of
playing one action or another in order to select one (satisfaction)
equilibrium among all the existing equilibria of the game $\mathcal{G}$. Thus,
the existence and uniqueness of this efficient SE play an important role in
the equilibrium selection. We analyze these two properties in the sequel of
this section.

1) Existence of an ESE: The main results on the existence of an ESE are based
on the fact that the game
$\hat{\mathcal{G}} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{c_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}}\right)$
is a potential game with constrained action sets (see Appendix A), as shown in
the following proposition.

Proposition 6: The game
$\hat{\mathcal{G}} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{c_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}}\right)$
is an exact constrained potential game, with potential function
$\phi : \mathcal{A} \to \mathbb{R}_+$ such that, $\forall a \in \mathcal{A}$,

$$\phi(a) = \sum_{k=1}^{K} c_k(a_k). \tag{27}$$

This simple observation allows us to obtain immediate results on the existence
of the ESE:

Proposition 7 (Existence of the ESE): The game $\hat{\mathcal{G}}$ in (25) has
at least one efficient SE if the graphs of the correspondences
$f_1, \ldots, f_K$ are non-empty and identical.
The proof of Prop. 7 follows from Theorem 14 in Appendix A. Other results on
the existence of an ESE and, more importantly, on the convergence of the best
response dynamics [19] to an ESE are immediately obtained by exploiting
properties such as supermodularity or submodularity of the potential function
(see [17]).
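The exact potential property behind Prop. 6 can be verified numerically: for any unilateral deviation of player $k$, the change in $c_k$ must equal the change in $\phi$. The sketch below checks this over all deviations (in particular, over those allowed by the constraints $f_k$); the function name `is_exact_potential` and the cost values are illustrative assumptions.

```python
from itertools import product

def is_exact_potential(action_sets, costs, tol=1e-12):
    """Check that phi(a) = sum_k c_k(a_k) is an exact potential.

    For every profile a, every player k and every unilateral deviation, the
    change in player k's cost must equal the change in phi.
    """
    def phi(profile):
        return sum(costs[k][profile[k]] for k in range(len(profile)))

    for profile in product(*action_sets):
        for k, actions in enumerate(action_sets):
            for deviation in actions:
                deviated = profile[:k] + (deviation,) + profile[k + 1:]
                cost_change = costs[k][deviation] - costs[k][profile[k]]
                if abs((phi(deviated) - phi(profile)) - cost_change) > tol:
                    return False
    return True

# Hypothetical costs in [0, 1] for two players with three actions each.
actions = [[1, 2, 3], [1, 2, 3]]
costs = [{1: 0.1, 2: 0.4, 3: 1.0}, {1: 0.2, 2: 0.3, 3: 0.9}]
print(is_exact_potential(actions, costs))
```

The check succeeds for any choice of costs because each $c_k$ depends only on $a_k$, so the terms of $\phi$ belonging to the other players cancel in the difference.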

2) Multiplicity of an ESE: One of the desirable features
of the ESE is to be unique. In particular, this will
allow engineers to estimate the operating point of decen- 
tralized networks. However, from existing literature in 
constrained games Ul], it can be stated that uniqueness 
is observed only under certain conditions. Thus, in 
Appendix B, we provide a general method to analyze
the number of ESE of a given game $\hat{\mathcal{G}}$. This method
is mainly a tool for network analysis; it is not intended
to be an algorithm that can be directly implemented in
networks.

In the next section, we compare the SE notion with 
existing equilibrium concepts such as NE and GNE. 

V. Satisfaction Equilibrium and other 
Equilibrium Concepts 

In the following, we highlight the main differences 
between the SE and other equilibrium notions such as 
NE and GNE. However, before we start, we point out the 
differences between the normal form and the satisfaction 
form formulations. 

A. Games in Normal Form and Satisfaction Form 

The main difference between the normal form
$\tilde{\mathcal{G}} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{u_k\}_{k \in \mathcal{K}}\right)$
and the satisfaction form
$\mathcal{G} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}}\right)$
is that the former defines a preference order [27], which can be modeled by a
family of utility functions $u_1, \ldots, u_K$ in the sense of von
Neumann-Morgenstern [28], i.e., given an action profile
$a_{-k} \in \mathcal{A}_{-k}$, player $k$ can rank any pair of its actions
$(a_k, a'_k) \in \mathcal{A}_k^2$ such that either
$u_k(a_k, a_{-k}) < u_k(a'_k, a_{-k})$, $u_k(a_k, a_{-k}) = u_k(a'_k, a_{-k})$
or $u_k(a_k, a_{-k}) > u_k(a'_k, a_{-k})$. In the latter, player $k$
determines only whether an action satisfies its individual conditions or not,
i.e., $a_k \in f_k(a_{-k})$ or $a_k \notin f_k(a_{-k})$, respectively. Hence,
the satisfaction form does not require the existence of a utility function
[29], [2] but a satisfaction correspondence $f_k$, which can be seen as an
additional degree of freedom of this formulation.

An interesting observation is that, by assuming a utility function of the form
$u_k : \mathcal{A} \to \{0, 1\}$, a game in normal form can also be used to
describe a situation where players are interested only in the satisfaction of
their individual constraints. Nonetheless, the notions of Nash equilibrium
(NE) and SE do not necessarily correspond to each other. The NE in pure
strategies in the context of games in normal form [30] can be defined as
follows.

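Definition 8 (Nash Equilibrium in PS): Consider
a game $\mathcal{G} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{u_k\}_{k \in \mathcal{K}})$ in normal form.
An action profile $a \in \mathcal{A}$ is an NE in pure strategies if it
satisfies, for all $k \in \mathcal{K}$ and for all $a'_k \in \mathcal{A}_k$,

    $u_k(a_k, a_{-k}) \geq u_k(a'_k, a_{-k})$.    (28)

Now, consider Ex. 1 and define the game in satisfaction form
$\hat{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$, with $f_k$
defined by (5), and a game in normal form
$\mathcal{G} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{u_k\}_{k \in \mathcal{K}})$, with
$u_k : \mathcal{A}_1 \times \ldots \times \mathcal{A}_K \to \{0, 1\}$ defined as follows:

    $u_k(a_k, a_{-k}) = \mathbb{1}_{\{a_k \in f_k(a_{-k})\}}$.    (29)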
Now, we compare the set of SE $\mathcal{A}_{\mathrm{SE}}$ of the game
$\hat{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ and the set of NE $\mathcal{A}_{\mathrm{NE}}$
of the game $\mathcal{G} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{u_k\}_{k \in \mathcal{K}})$. Note that
from Def. 8 and Def. 1 it follows immediately
that any SE of the game $\hat{\mathcal{G}}$ is an NE of the game $\mathcal{G}$.
This is because, at the SE, all players obtain a
unitary utility and, since the range of the utility function
is binary $\{0, 1\}$, no other action is able to give a higher
utility. The converse is not true, that is, an NE of the
game $\mathcal{G}$ is not necessarily an SE of the game $\hat{\mathcal{G}}$.
Consider for instance the game realization ($K = 2$,
$N_1 = N_2 = 2$) in Fig. 1. Therein,
the game $\mathcal{G}$ has 2 NE
in pure strategies, which are the action profiles
$(A_1^{(1)}, A_2^{(2)})$ and $(A_1^{(2)}, A_2^{(1)})$, while the game
$\hat{\mathcal{G}}$ has only one SE, which is the
action profile $(A_1^{(1)}, A_2^{(2)})$. This simple example shows
that the formulation in normal form might lead to equilibria
where not all the players are satisfied even when
the joint satisfaction of all players is feasible.

    P1 \ P2       A_2^{(1)}    A_2^{(2)}
    A_1^{(1)}     (0,0)        (1,1)
    A_1^{(2)}     (1,0)        (0,0)

Fig. 1. Game in normal form $\mathcal{G} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{u_k\}_{k \in \mathcal{K}})$, with
$\mathcal{K} = \{1, 2\}$, $\mathcal{A}_k = \{A_k^{(1)}, A_k^{(2)}\}$, for all $k \in \mathcal{K}$. Player 1 chooses
rows and player 2 chooses columns. In a pair $(v_1, v_2) \in \{0, 1\}^2$, $v_1$
and $v_2$ are the utilities obtained by player 1 and 2, respectively.

This shows
that games in normal form do not properly model the
case where players are interested only in the satisfaction
of individual conditions. We conclude the comparison
between the games $\mathcal{G}$ and $\hat{\mathcal{G}}$ by establishing the
following relation between their sets of equilibria,

    $\mathcal{A}_{\mathrm{SE}} \subseteq \mathcal{A}_{\mathrm{NE}} \subseteq \mathcal{A}$.    (30)

This confirms the intuition that the notion of SE is more
restrictive than the notion of NE, that is, an SE in the
game $\hat{\mathcal{G}}$ is a Pareto optimal NE in the game $\mathcal{G}$, that is,
an action profile where all players are satisfied.
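
As an illustration of relation (30), the following Python sketch enumerates by brute force the pure-strategy NE and SE of the two-player game of Fig. 1. The utility table is transcribed from the figure under the assumption that its first row corresponds to $A_1^{(1)}$; the helper names `is_se` and `is_ne` are illustrative and do not belong to the original formulation.

```python
from itertools import product

# Binary utilities read off Fig. 1 (rows: player 1, columns: player 2).
# Index 0 stands for A_k^(1) and index 1 for A_k^(2); the row order is an
# assumption made when transcribing the figure.
U = {
    (0, 0): (0, 0), (0, 1): (1, 1),
    (1, 0): (1, 0), (1, 1): (0, 0),
}
ACTIONS = (0, 1)


def is_se(profile):
    """Every player obtains utility 1, i.e. a_k lies in f_k(a_-k) for all k."""
    return all(value == 1 for value in U[profile])


def is_ne(profile):
    """No player can strictly raise its utility by a unilateral deviation."""
    for k in range(2):
        for deviation in ACTIONS:
            alternative = list(profile)
            alternative[k] = deviation
            if U[tuple(alternative)][k] > U[profile][k]:
                return False
    return True


profiles = list(product(ACTIONS, repeat=2))
se_set = {p for p in profiles if is_se(p)}
ne_set = {p for p in profiles if is_ne(p)}
print("SE:", sorted(se_set))
print("NE:", sorted(ne_set))
```

Under this reading of the table, the single SE is also an NE, while the second NE leaves player 2 unsatisfied.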

B. Satisfaction Equilibrium and Nash Equilibrium 

The definition of NE (Def. 8) can be obtained from
the definition of SE (Def. 1) by assuming that, for all
$k \in \mathcal{K}$, the satisfaction correspondence $f_k$ is defined as
follows,

    $f_k(a_{-k}) = \arg\max_{a_k^* \in \mathcal{A}_k} u_k(a_k^*, a_{-k})$.    (31)

The satisfaction correspondence $f_k$ as defined in (31)
is known in the game-theoretic literature as the best
response correspondence [30]. Then, under this formulation,
the set of SE of the game in satisfaction form
$\hat{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ is identical to the set of NE of
the game in normal form $\mathcal{G} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{u_k\}_{k \in \mathcal{K}})$.
This reasoning might lead us to think that the satisfaction
form as well as the notion of SE are generalizations
of the classical normal form and the notion of Nash
equilibrium [9], respectively.
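
The equivalence stated above can be checked numerically on a small example. The sketch below is only an illustration: the game size, the random integer utilities and the helper names (`swap`, `best_response`) are assumptions introduced here. It builds the correspondence (31) and verifies that the profiles satisfying $a_k \in f_k(a_{-k})$ for all $k$ are exactly the pure-strategy NE.

```python
from itertools import product
import random

random.seed(0)
N_ACTIONS = (3, 3)  # hypothetical numbers of actions N_1 and N_2
profiles = list(product(*(range(n) for n in N_ACTIONS)))
# Hypothetical integer utilities u[k][profile], drawn at random.
u = [{p: random.randint(0, 4) for p in profiles} for _ in N_ACTIONS]


def swap(profile, k, action):
    """Profile obtained when player k unilaterally switches to `action`."""
    changed = list(profile)
    changed[k] = action
    return tuple(changed)


def best_response(k, profile):
    """f_k(a_-k) as in (31): the maximisers of u_k given the others' actions."""
    values = {a: u[k][swap(profile, k, a)] for a in range(N_ACTIONS[k])}
    best = max(values.values())
    return {a for a, v in values.items() if v == best}


se_set = {
    p for p in profiles
    if all(p[k] in best_response(k, p) for k in range(2))
}
ne_set = {
    p for p in profiles
    if all(u[k][p] >= u[k][swap(p, k, a)]
           for k in range(2) for a in range(N_ACTIONS[k]))
}
print("sets coincide:", se_set == ne_set)
```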

C. Satisfaction Equilibrium and Generalized Nash Equi- 
librium 

The GNE in pure strategies (PS) in games in normal
form with constrained sets of actions, as introduced by
Debreu in [7] and later by Rosen in [8], can be defined
as follows.

Definition 9 (Generalized NE in PS): An action
profile $a^* \in \mathcal{A}$ is a generalized Nash equilibrium (GNE)
of the game $\tilde{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{u_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ if
and only if

    $\forall k \in \mathcal{K}$, $a_k^* \in f_k(a_{-k}^*)$ and
    $\forall a_k \in f_k(a_{-k}^*)$, $u_k(a_k^*, a_{-k}^*) \geq u_k(a_k, a_{-k}^*)$.

Note that the definition of SE (Def. 1) can be obtained
from the definition of GNE (Def. 9) by assuming the
following condition, $\forall k \in \mathcal{K}$ and $\forall a \in \mathcal{A}$,

    $u_k(a_k, a_{-k}) = c$, with $c \in \mathbb{R}_+$.    (32)

Under assumption (32), the set of GNE of the game
in normal form with constrained set of actions
$\tilde{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{u_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ and the set of SE of
the game $\hat{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ in satisfaction
form are identical. This observation does not necessarily
imply that the satisfaction form as well as the notion of
SE are particular cases of the classical normal form with
constrained set of actions and the notion of GNE,
respectively. Indeed, there exist fundamental differences:
(i) In the game $\tilde{\mathcal{G}}$, the set of available actions for player
$k$ is determined by the complementary vector $a_{-k}$. On
the contrary, in the game $\hat{\mathcal{G}}$, the set of available actions
of player $k$ is always the set $\mathcal{A}_k$. (ii) In the game $\tilde{\mathcal{G}}$,
a rational player $k$ determines the action to play by
following two different steps. First, it determines the set
of available actions $f_k(a_{-k})$ and second, it determines the
actions $a_k \in f_k(a_{-k})$ that maximize $u_k$. In contrast, in
the game $\hat{\mathcal{G}}$, player $k$ does not require any optimization
capability. (iii) In the game $\tilde{\mathcal{G}}$, the interpretation of
$f_k(a_{-k}) = \emptyset$ is that player $k$ cannot play since none
of its actions is available given the actions of the others
$a_{-k}$. On the contrary, in the game $\hat{\mathcal{G}}$, the interpretation
of $f_k(a_{-k}) = \emptyset$ is that player $k$ can take any of its
actions $a_k \in \mathcal{A}_k$, but none of them achieves satisfaction.
This difference might appear subtle but it makes a
strong difference when the equilibrium must be learnt
dynamically [19]. Indeed, in an eventual exploration
phase of a learning algorithm, at each stage, player $k$
always has a non-empty set of actions to test in the
game $\hat{\mathcal{G}}$ regardless of the actions of all the other players.
Clearly, this is not the case in the game $\tilde{\mathcal{G}}$, which
constrains the learning process.

In the following, we compare the sets of equilibria of
both games $\tilde{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{u_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ and
$\hat{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ for a general definition of
the utility functions $u_k$, for all $k \in \mathcal{K}$. Let the set of
GNE of the game $\tilde{\mathcal{G}}$ and the set of SE of the game $\hat{\mathcal{G}}$
be denoted by $\mathcal{A}_{\mathrm{GNE}}$ and $\mathcal{A}_{\mathrm{SE}}$, respectively. Now, note
that from Def. 9 and Def. 1 it follows that any GNE in
$\tilde{\mathcal{G}}$ is an SE in $\hat{\mathcal{G}}$, i.e.,

    $\mathcal{A}_{\mathrm{GNE}} \subseteq \mathcal{A}_{\mathrm{SE}} \subseteq \mathcal{A}$.    (33)

The equality $\mathcal{A}_{\mathrm{GNE}} = \mathcal{A}_{\mathrm{SE}}$ is achieved when the
functions $u_k$ are chosen following (32). The condition
in (33) verifies the intuition that the notion of SE in
games in satisfaction form is less restrictive than the
notion of GNE in games in normal form with constrained
action sets. Note also that from Def. 9 it might be
implied that several SE might exist, while no GNE
necessarily exists. This is due to the fact that
the existence of a GNE depends on both the functions
$u_k$ and $f_k$, while the existence of an SE depends uniquely
on the correspondences $f_k$, with $k \in \mathcal{K}$. Conversely, the
existence of a GNE implies the existence of at least one
SE.
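
Relation (33), and the equality obtained under the constant utility (32), can be illustrated on a small randomly generated game. The following sketch is not part of the original analysis: the correspondences, the utilities and all function names are hypothetical, and only two players are considered for brevity.

```python
from itertools import product
import random

random.seed(1)
SIZES = (3, 3)  # hypothetical numbers of actions N_1 and N_2
profiles = list(product(range(SIZES[0]), range(SIZES[1])))

# Hypothetical correspondences: f[k][a_other] is the subset of player k's
# actions that achieve satisfaction when the other player plays a_other.
f = [
    {o: {a for a in range(SIZES[k]) if random.random() < 0.6}
     for o in range(SIZES[1 - k])}
    for k in range(2)
]
# Hypothetical utilities u[k][profile], and the constant utility of (32).
u = [{p: random.random() for p in profiles} for _ in range(2)]
constant_u = [{p: 1.0 for p in profiles} for _ in range(2)]


def deviate(p, k, a):
    """Profile obtained when player k unilaterally switches to action a."""
    return (a, p[1]) if k == 0 else (p[0], a)


def is_se(p):
    return all(p[k] in f[k][p[1 - k]] for k in range(2))


def is_gne(p, util):
    # Feasibility (first condition of Def. 9) plus optimality over f_k(a_-k).
    return is_se(p) and all(
        util[k][p] >= util[k][deviate(p, k, a)]
        for k in range(2)
        for a in f[k][p[1 - k]]
    )


se_set = {p for p in profiles if is_se(p)}
gne_set = {p for p in profiles if is_gne(p, u)}
gne_const = {p for p in profiles if is_gne(p, constant_u)}
print(len(gne_set), "GNE among", len(se_set), "SE")
```

By construction, every GNE found this way is an SE, and with the constant utility both sets coincide.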

In the following, we focus on designing behavioral rules
for the radio devices that let them learn one
satisfaction equilibrium in decentralized self-configuring
networks.

VI. Learning Satisfaction Equilibrium 

In this section, we study a behavioral rule that allows
radio devices to learn a satisfaction equilibrium in a fully
decentralized fashion. Here, the underlying assumption
is that players do not need to observe the value of their
achieved utility, i.e., transmission rate, energy efficiency,
etc., but only to know whether they are satisfied or not at
each stage of the learning process. This implies only a
1-bit message exchange between the corresponding
transmitter-receiver pairs. In the following, we formulate
the corresponding learning problem and later, we introduce
the behavioral rules that allow players to learn the
SE.



A. The Learning Problem Formulation 

We describe the SE learning process in terms of
elements of the game $\hat{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ as
follows. Assume that time is divided in intervals and
denote each interval by the index $n \in \mathbb{N}$. Each interval
ends when each player has played at most once. Denote
the action taken by player $k$ at interval $n$ by $a_k(n)$. At
each interval $n$, player $k$ observes whether it is satisfied
or not, i.e., it observes a binary variable

    $v_k(n) = \mathbb{1}_{\{a_k(n) \in f_k(a_{-k}(n))\}}$.    (34)

Our intention is to learn at least one SE by letting
the players interact following particular behavioral
rules. We say that players learn an equilibrium in pure
strategies if, after a given finite number of time intervals,
all players have chosen an action which achieves
satisfaction, and thus, no other action update takes place.



B. Learning the SE in Pure Strategies 

Before we present the behavioral rule which allows
players to achieve one of the equilibria of the game
$\hat{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$, we state the following
hypotheses:

(i) The game $\hat{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ has at
least one SE in pure strategies.

(ii) For all $k \in \mathcal{K}$, it holds that $\forall a_{-k} \in \mathcal{A}_{-k}$, the set
$f_k(a_{-k})$ is not empty.

(iii) The sets $\mathcal{K}$ and $\{\mathcal{A}_k\}_{k \in \mathcal{K}}$ are finite.

The first hypothesis ensures that the SE learning problem
is well-posed, i.e., radio devices are assigned a feasible
task. The second hypothesis refers to the fact that each
radio device is always able to find a transmit/receive
configuration with which it can be considered satisfied
given the transmit/receive configuration of all the other
radio devices. This assumption might appear restrictive
but it is not necessarily the case; see the discussion on the
"do nothing" action in Sec. IV-A. The third hypothesis is
considered in order to ensure that our algorithm is able
to converge in finite time.

Under the assumption that all hypotheses hold, each
player chooses its own action as follows. The first action
of player $k$, denoted by $a_k(0)$, is taken following an
arbitrary probability distribution $\pi_k(0) \in \triangle(\mathcal{A}_k)$. Often,
such a probability distribution $\pi_k(0)$ is the uniform probability
distribution. At time interval $n > 0$, player $k$ changes its
action if and only if it is not satisfied, i.e., its binary
satisfaction indicator is $v_k(n-1) = 0$.
In this case, the next action is chosen following a
probability distribution $\pi_k(n)$, to which we refer as the
probability distribution of exploration. If player $k$ is
satisfied, i.e., $v_k(n-1) = 1$, then it keeps playing the
same action. Hence, we can write that

    $a_k(n) = \begin{cases} a_k(n-1) & \text{if } v_k(n-1) = 1, \\ a_k(n) \sim \pi_k(n) & \text{if } v_k(n-1) = 0. \end{cases}$    (35)

The behavioral rule (35) is based on the proposal in [21].
Here, note that a larger class of individual constraints
can be considered due to the formulation of (34). Under
this formulation, the room for optimization is in the
design of $\pi_k(n)$ and its evolution over time. However,
we leave this issue out of the scope of the paper and no
particular probability distribution is assumed. We formalize
the behavioral rule (35) in Alg. 1. Regarding
the convergence of this behavioral rule, we provide the
following proposition.

Proposition 10: The behavioral rule (35) with probability
distributions $\pi_k = (\pi_{k,1}, \ldots, \pi_{k,N_k}) \in \triangle(\mathcal{A}_k)$,
with $k \in \mathcal{K}$, converges to an SE of the game
$\hat{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ in finite time if for all
$k \in \mathcal{K}$ and for all $n_k \in \{1, \ldots, N_k\}$, it holds that

    $\pi_{k,n_k}(n) > 0$,    (36)

at each time interval $n \in \mathbb{N}$, and assumptions (i), (ii)
and (iii) always hold.

Algorithm 1 Learning the SE of the Game $\hat{\mathcal{G}}$

Require: At each instant $n > 0$: $v_k(n)$.
 1: $n = 0$;
 2: $\forall n_k \in \{1, \ldots, N_k\}$, ... 0,
 3: $a_k(0) \sim \pi_k(0)$;
 4: for all $n > 0$ do
 5:     $\forall n_k \in \{1, \ldots, N_k\}$, update $\pi_k(n)$.
 6:     $a_k(n) = a_k(n-1)$ if $v_k(n-1) = 1$; $a_k(n) \sim \pi_k(n)$ otherwise.
 7: end for

The proof of Prop. 10 follows simply from the fact that
(36) implies that every action profile will be played at
least once with non-zero probability during a sufficiently
large time interval. Then, since at least one SE exists,
this action profile will be played at least once. Now,
from (35), it follows that once an SE is played, no
player changes its current action. Thus, convergence is
observed.
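
A minimal simulation sketch of the behavioral rule is given below, assuming uniform exploration and a simultaneous update of all unsatisfied players. The toy two-player game, in which each player is satisfied only when both play the same action, and the names `learn_se` and `coordination` are assumptions made for illustration; they do not correspond to the network scenario studied later.

```python
import random


def learn_se(satisfied, sizes, rng, max_steps=10_000):
    """Behavioral rule with uniform exploration.

    satisfied(k, profile) -> bool plays the role of the 1-bit feedback.
    Returns the final profile and the number of intervals used, or
    (None, max_steps) if no SE was reached within max_steps.
    """
    profile = [rng.randrange(n) for n in sizes]  # first action, drawn uniformly
    for step in range(max_steps):
        feedback = [satisfied(k, profile) for k in range(len(sizes))]
        if all(feedback):
            return tuple(profile), step  # nobody updates any more
        # Only unsatisfied players draw a new action; the others keep theirs.
        profile = [
            action if ok else rng.randrange(n)
            for action, ok, n in zip(profile, feedback, sizes)
        ]
    return None, max_steps


def coordination(k, profile):
    """Toy correspondence: f_k(a_-k) = {a_-k}, which is never empty."""
    return profile[0] == profile[1]


final_profile, intervals = learn_se(coordination, (3, 3), random.Random(0))
print("reached", final_profile, "after", intervals, "intervals")
```

In this toy game, all three hypotheses hold and every action keeps a strictly positive probability, so the run stops at a profile where both players are satisfied.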

From the reasoning above, it can be concluded that any
probability distribution $\pi_k(n)$ such that all actions have
a non-zero probability of being played, for all $n$, can
be chosen as the probability distribution of exploration.
However, the choice of this probability distribution
might impact the convergence time. Two particular ways
of building the probability distribution $\pi_k(n)$ are proposed
in [21]. In the first case, a uniform probability
distribution is used during the whole learning process.
That is, for all $k \in \mathcal{K}$ and for all $n_k \in \{1, \ldots, N_k\}$,

    $\pi_{k,n_k}(n) = \frac{1}{N_k}$.    (37)

In the second case, at time interval $n$, higher probabilities
are assigned to actions which have been played a smaller
number of times during all time intervals between $0$ and
$n - 1$. Let $T_{k,n_k}(n) \in \mathbb{N}$, with $k \in \mathcal{K}$ and
$n_k \in \{1, \ldots, N_k\}$, be the number of times that player $k$ has
played action $A_k^{(n_k)}$ up to time interval $n$, i.e.,

    $T_{k,n_k}(n) = \sum_{s=0}^{n-1} \mathbb{1}_{\{a_k(s) = A_k^{(n_k)}\}}$.    (38)

Then, the probability distribution to select the next action
is the following:

    $\pi_{k,n_k}(n) = \frac{\frac{1}{T_{k,n_k}(n)}}{\sum_{m=1}^{N_k} \frac{1}{T_{k,m}(n)}}$,    (39)

where $T_{k,n_k}(0) = \delta$, with $\delta > 0$.
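
The two exploration distributions can be sketched as follows. This is only an illustration: it assumes that the second rule weights each action by the inverse of its play count, initialised at a strictly positive value, and the function names are hypothetical.

```python
def uniform_pi(n_actions):
    """Uniform exploration: every action has probability 1 / N_k."""
    return [1.0 / n_actions] * n_actions


def inverse_count_pi(counts):
    """Exploration favouring rarely played actions.

    counts[i] is the (strictly positive) play count of action i, assumed to
    start from a small positive value so that no division by zero occurs.
    """
    weights = [1.0 / t for t in counts]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical counts: the first action has been played least often.
pi = inverse_count_pi([1.0, 2.0, 4.0])
print(uniform_pi(4))
print([round(x, 4) for x in pi])
```

Both distributions assign a strictly positive probability to every action, which is the property required for convergence.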

C. Clipping Strategies and SE 

The behavioral rule (35) converges to an SE in pure
strategies in finite time. However, in real system scenarios,
it is often observed that there might exist an
action of a given player which achieves satisfaction
regardless of the actions adopted by all the other players.
We refer to this kind of action as a clipping action [22].

Definition 11 (Clipping Action): In the game
$\hat{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$, a player $k \in \mathcal{K}$ is said to have
a clipping action $a_k \in \mathcal{A}_k$ if

    $\forall a_{-k} \in \mathcal{A}_{-k}$, $a_k \in f_k(a_{-k})$.    (40)

As shown in the following proposition, the existence
of clipping actions in the game
$\hat{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ might inhibit the convergence
of the behavioral rule in (35).

Proposition 12: Consider the game
$\hat{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ in satisfaction form.
Assume the existence of at least one clipping action
and denote it by $a_k^* \in \mathcal{A}_k$ for player $k$, with $k \in \mathcal{K}$.
Then, if there exists a player $j \in \mathcal{K} \setminus \{k\}$ for which
$f_j(a_k^*, a_{-\{j,k\}}) = \emptyset$, $\forall a_{-\{j,k\}} \in \prod_{i \in \mathcal{K} \setminus \{j,k\}} \mathcal{A}_i$, the
behavioral rule in (35) does not converge to an SE with
strictly positive probability.

The proof of Prop. 12 follows from the fact that at time
$n > 0$ before convergence, the probability that player
$k$ plays the clipping action $a_k^*$ is strictly positive (36).
If player $k$ plays $a_k^*$, by definition, there exists a player
$j \neq k$ which would never be satisfied. Then, the behavioral
rule does not converge to any SE. Nonetheless,
simple alternatives can be used to solve this problem.
For instance, the behavioral rule in (35) can be modified
such that a player changes its current action (using a
given probability distribution over the actions) even if it
is satisfied, when it sees the other players not satisfied
during a long period. In this case, however, players
would need more than 1-bit feedback in order
to detect the non-satisfaction of the others, for instance,
the feedback of the instantaneous value of the performance
metric. This approach can be compared with the
idea of epsilon experimentation discussed in [31].
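
To make the definition of a clipping action and the blocking condition of Prop. 12 concrete, the sketch below checks both on a hypothetical two-player game with two actions each. The correspondences `f1` and `f2` are invented for the example, and `clipping_actions` is an illustrative helper, not a routine from the cited works.

```python
def clipping_actions(f_k, own_actions):
    """Actions a_k that belong to f_k(a_-k) for every opponents' profile a_-k.

    f_k maps each opponents' profile a_-k to the set f_k(a_-k).
    """
    return {a for a in own_actions if all(a in sat for sat in f_k.values())}


ACTIONS = (0, 1)
# Hypothetical correspondences, indexed by the other player's action.
f1 = {0: {1}, 1: set()}    # player 1 can never be satisfied if player 2 plays 1
f2 = {0: {1}, 1: {0, 1}}   # action 1 always satisfies player 2

clip1 = clipping_actions(f1, ACTIONS)
clip2 = clipping_actions(f2, ACTIONS)
# Clipping actions of player 2 that leave player 1 with an empty f_1.
blocking = {a for a in clip2 if not f1[a]}
# Pure-strategy SE of the toy game, listed as (a_1, a_2).
se = [(a1, a2) for a1 in ACTIONS for a2 in ACTIONS
      if a1 in f1[a2] and a2 in f2[a1]]
print("clipping actions of player 2:", clip2, "blocking:", blocking, "SE:", se)
```

In this toy game an SE exists, yet once player 2 settles on its clipping action it never updates again and player 1 can never be satisfied.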

VII. Applications 

In this section, we apply the concept of SE to the case
of a classical interference channel [32] with two
transmitter-receiver pairs sharing a common bandwidth.
Here, the notion of SE is compared with existing
equilibrium notions such as GNE. At the same time, the
performance of the behavioral rules presented in Sec. VI
is evaluated in terms of convergence time to a satisfaction
equilibrium.

A. Satisfaction Equilibrium in the Interference Channel 

Consider a set $\mathcal{K} = \{1, 2\}$ of two transmitter-receiver
pairs simultaneously operating over the same frequency
band and thus, subject to mutual interference. Each
transmitter communicates only with its corresponding
receiver and any kind of message exchange aiming to
achieve transmit cooperation is not considered. For all
$(j, k) \in \mathcal{K}^2$, denote by $g_{j,k}$ and $p_k^{(n_k)}$ the channel gain
between transmitter $k$ and receiver $j$, and the $n_k$-th
transmit power level of transmitter $k$, respectively. We
denote by $\mathcal{A}_k = \{p_k^{(1)}, \ldots, p_k^{(N_k)}\}$ the set of all possible
transmit power levels of player $k$. For all $k \in \mathcal{K}$, the
minimum transmit power is $p_k^{(1)} = 0$ and the maximum
transmit power is $p_k^{(N_k)} = p_{k,\max}$. The QoS metric,
denoted by $u_k : \mathcal{A}_1 \times \mathcal{A}_2 \to \mathbb{R}_+$, of the
transmitter-receiver pair $k$ is its (Shannon) transmission rate in bits
per second (bps). Thus, for all $(p_k, p_{-k}) \in \mathcal{A}_k \times \mathcal{A}_{-k}$,
we write that

    $u_k(p_k, p_{-k}) = \log_2\left(1 + \frac{p_k g_{k,k}}{\sigma_k^2 + p_{-k} g_{k,-k}}\right)$ [bps/Hz].    (41)

Here, $\sigma_k^2$ is the noise level at receiver $k$ and we denote
the signal to noise ratio at the transmitter $k$ by
$\mathrm{SNR}_k = \frac{p_{k,\max}}{\sigma_k^2}$. The QoS requirement for player $k$ is to provide
a transmission rate higher than $\Gamma_k$ bps. Thus, we model
the satisfaction correspondence $f_k$ as follows,

    $f_k(p_{-k}) = \{p_k \in \mathcal{A}_k : u_k(p_k, p_{-k}) \geq \Gamma_k\}$.    (42)
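
The rate expression and the resulting satisfaction correspondence translate directly into code. The sketch below is illustrative only: the channel gains, noise level, power grid and rate target are hypothetical values chosen for readability, and the function names are assumptions.

```python
import math


def rate(p_k, p_other, g_direct, g_cross, noise):
    """Shannon rate log2(1 + p_k g_kk / (sigma_k^2 + p_-k g_k,-k))."""
    return math.log2(1.0 + p_k * g_direct / (noise + p_other * g_cross))


def satisfaction_set(levels, p_other, g_direct, g_cross, noise, gamma):
    """Power levels of player k whose rate reaches the target gamma."""
    return [p for p in levels
            if rate(p, p_other, g_direct, g_cross, noise) >= gamma]


# Hypothetical setting: four power levels, unit noise, target of 1 bps/Hz.
LEVELS = [0.0, 1.0, 2.0, 3.0]
quiet = satisfaction_set(LEVELS, p_other=0.0, g_direct=1.0, g_cross=0.5,
                         noise=1.0, gamma=1.0)
loud = satisfaction_set(LEVELS, p_other=2.0, g_direct=1.0, g_cross=0.5,
                        noise=1.0, gamma=1.0)
print(quiet, loud)
```

As expected, the set of satisfying power levels shrinks when the interfering transmitter raises its power.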



We assume also that transmitters associate different
effort measures with each of their power levels. The
higher the transmit power, the higher the effort.
This scenario is modeled by a game in satisfaction form
$\hat{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ and a game
in normal form with constrained action sets
$\tilde{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{c_k\}_{k \in \mathcal{K}}, \{\tilde{f}_k\}_{k \in \mathcal{K}})$, where, for all
$k \in \mathcal{K}$, the cost or effort function $c_k$ is defined as follows

    $c_k(p_k) = \begin{cases} p_{k,\max} + \delta & \text{if } p_k = p_k^{(1)}, \\ p_k & \text{if } p_k \in \{p_k^{(2)}, \ldots, p_k^{(N_k)}\}, \end{cases}$    (43)

where $\delta > 0$. Note that the most costly action is not to
transmit. This choice is made to force the radio devices
to transmit any time it is possible. The correspondence
$\tilde{f}_k$ is defined as follows:

    $\tilde{f}_k(p_{-k}) = \begin{cases} f_k(p_{-k}) & \text{if } f_k(p_{-k}) \neq \emptyset, \\ \{p_k^{(1)}\} & \text{otherwise.} \end{cases}$    (44)

Here, we include the non-transmission action $p_k^{(1)} = 0$
in order to avoid an empty set of actions for player $k$
when there does not exist an action able to achieve the
required minimum rate.

Fig. 2. Achievable (Shannon) transmission rates
$(u_1(p_1, p_2), u_2(p_1, p_2))$, for all $(p_1, p_2) \in \mathcal{A}_1 \times \mathcal{A}_2$, with
$\mathrm{SNR}_k = \frac{p_{k,\max}}{\sigma_k^2} = 10$ dB, $(\Gamma_1, \Gamma_2) = (1.5, 1.5)$ bps and
$N_1 = N_2 = 32$ power levels. Legend: achievable rates; minimum rate
Tx. 2; satisfaction equilibria; efficient satisfaction equilibria;
reference point $(p_{1,\max}, p_{2,\max})$.
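
A short sketch of the effort function and of the constrained correspondence follows. It assumes that the power levels are stored in increasing order with the non-transmission level first, and the names `effort` and `constrained_set` are hypothetical.

```python
def effort(p_k, levels, delta):
    """Effort of a power level: silence costs p_max + delta, otherwise p_k.

    levels is assumed sorted in increasing order, with levels[0] the
    non-transmission level and levels[-1] the maximum power.
    """
    return levels[-1] + delta if p_k == levels[0] else p_k


def constrained_set(satisfying_levels, levels):
    """Constrained action set: fall back to silence when nothing satisfies."""
    return list(satisfying_levels) if satisfying_levels else [levels[0]]


LEVELS = [0.0, 1.0, 2.0]  # hypothetical power grid
efforts = [effort(p, LEVELS, delta=0.5) for p in LEVELS]
print(efforts)
print(constrained_set([], LEVELS), constrained_set([1.0, 2.0], LEVELS))
```

With these values, not transmitting is the most costly action, so a cost-minimizing player stays silent only when no power level reaches its rate target.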

Note that if the following holds, $\forall k \in \mathcal{K}$, $\exists p_k \in \mathcal{A}_k$
such that

    $\log_2\left(1 + \frac{p_k g_{k,k}}{\sigma_k^2 + p_{-k,\max} g_{k,-k}}\right) \geq \Gamma_k$,    (45)

the existence of at least one SE is ensured by the SE existence
theorem presented earlier. Note that under such an assumption $f_k(p_{-k})$ is never
empty (condition (ii)) and, by establishing the comparison
greater than or equal to ($\geq$) as the binary relation $\preceq$ in
$\mathcal{A}$, conditions (i) and (iii) are always verified.
In Fig. 2, we plot (in red circles) all the achievable (Shannon)
transmission rates for both transmitters, i.e., the
pairs $(u_1(p_1, p_2), u_2(p_1, p_2))$, for all $(p_1, p_2) \in \mathcal{A}_1 \times \mathcal{A}_2$
and a particular channel realization. All the equilibria of
the game in satisfaction form $\hat{\mathcal{G}}$ and of the game in normal
form with constrained action sets $\tilde{\mathcal{G}}$ are plotted. The game
$\tilde{\mathcal{G}}$ has a unique equilibrium, which is the pair
$(0, p_{2,\max})$ (see Def. 9 and the
reference point $(p_{1,\max}, p_{2,\max})$ in Fig. 2). The game $\hat{\mathcal{G}}$
has multiple equilibria (Def. 1). In particular, note that
with the game in normal form with constrained strategies,
it is not possible to simultaneously satisfy the QoS
of both transmitters. In this case, only transmitter 2 can
be satisfied. On the contrary, in the game formulated in
satisfaction form, all players are able to satisfy their QoS
demands at the equilibrium of the game. Importantly,
the ESE satisfies the QoS condition for both transmitters
with the lowest transmit power, while all the other SE
require a higher transmission power. In particular, note
that (with this particular channel realization) the sets
of GNE and ESE appear to be singletons. However, as
shown before, the existence and uniqueness of the ESE
and GNE are conditioned in general. With this simple
example, we have shown that including the notion
of performance maximization, i.e., the notion of GNE,
leads to an unsatisfying game outcome, where only one
player is satisfied, while the simultaneous satisfaction of
both players is feasible.

Fig. 3. Histogram of the convergence time to a SE in the game
$\hat{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ using the algorithm (Alg. 1).
Horizontal axis: time intervals before convergence. Legend: Uniform
Distribution; [Ross-2007]. Here, $\mathrm{SNR}_k = \frac{p_{k,\max}}{\sigma_k^2} = 10$ dB,
$(\Gamma_1, \Gamma_2) = (1.5, 1.5)$ bps and $N_1 = N_2 = 32$ power levels.

Fig. 4. Histogram of the event of convergence or non-convergence
of the learning algorithm (Alg. 1) in the game
$\hat{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$. Legend: Uniform Distribution;
[Ross-2007]. Here, $\mathrm{SNR}_k = \frac{p_{k,\max}}{\sigma_k^2} = 10$ dB,
$(\Gamma_1, \Gamma_2) = (1.5, 1.5)$ bps and $N_1 = N_2 = 32$ power levels.

B. Clipping Actions in the Interference Channel 

Note that the game $\hat{\mathcal{G}}$ with the particular channel
realization used in Fig. 2 possesses at least one clipping
action. For instance, when transmitter 2 transmits at
the maximum power $p_{2,\max}$, it is always satisfied even
if player 1 transmits at the maximum power (see the
reference point $(p_{1,\max}, p_{2,\max})$ in Fig. 2). At the same
time, if player 2 transmits at the maximum power, player
1 is unable to achieve satisfaction. Hence, if before
observing convergence, transmitter 2 uses, for instance,
its maximum transmit power, then convergence to an SE
is observed neither in finite nor in infinite time.
In Fig. 4, we show a histogram of the convergence or
non-convergence of the algorithm. Here, we say that the
algorithm does not converge if during 100 consecutive
time intervals, a given player does not change its
current action, while the other still does (this implies that
a clipping action might be being played). At each trial of
the algorithm, we use the same channel realization used
in Fig. 2. Note that independently of the probability
distributions $\pi_k(n)$ adopted by player $k$ to try new actions,
the probability of one player playing a clipping strategy is
non-negligible (0.3). In the particular case of the interference
channel as treated here, the corresponding game is free
of clipping actions if the simultaneous transmission at
maximum power allows satisfaction. However, in this
case, the distinction between SE and NE loses its
importance since both equilibrium concepts would be
able to give a satisfactory solution to the QoS problem.
This observation leaves open the way for further research
on learning algorithms in the context of the SE in the
presence of clipping actions.

C. Convergence Time to the SE 

Now, our interest focuses on the average time for
converging to one SE of the game $\hat{\mathcal{G}}$, when convergence
is observed in the previous experiment. The convergence
time is measured as the number of action updates required
by each transmitter before convergence. In Fig. 3,
we show a histogram of the convergence time when players
try new actions with the probability distributions in
(37) and (39). Note that in this particular scenario, using
a probability distribution different from the uniform one does
not bring a significant improvement. Interestingly, the
histogram shows that if convergence is observed, most
of the time (80%), satisfaction is achieved in less than
20 time intervals (action updates).
In Fig. 5, we plot the achieved transmission rate of both
links at each instant $n$ when the behavioral rule (35)
is used. Therein, it can be observed that even though
a transmitter is satisfied, and thus does not change its
transmission power level, its instantaneous transmission
rate changes due to the action updates of the other
transmitter. Once both transmitters are satisfied,
neither of them changes its transmit power.

Fig. 5. Instantaneous achieved rates of transmitter 1 (red) and 2
(blue). Here, $\mathrm{SNR}_k = \frac{p_{k,\max}}{\sigma_k^2} = 10$ dB, $(\Gamma_1, \Gamma_2) = (1.5, 1.5)$ bps
and $N_1 = N_2 = 32$ power levels.

VIII. Conclusions 

The game formulation in satisfaction form (SF) and
the notion of satisfaction equilibrium (SE) introduced
in this paper have been shown to be well suited to
model the problem of QoS provisioning in decentralized
self-configuring networks. At the SE, all players are
satisfied. On the contrary, when the QoS provisioning
problem is modeled by games in classical normal form
or normal form with constrained set of actions, equilibria
where not all the players achieve satisfaction might be
observed, even when there exist action profiles that allow
the simultaneous satisfaction of all players. The notion
of SE has been formalized in the context of pure and
mixed strategies and its existence and uniqueness have
been studied. In particular, when no SE exists in either
pure or mixed strategies, necessary and sufficient
conditions for the existence of an epsilon-SE have been
presented. However, not all games in SF possess an
epsilon-SE. Finally, a learning dynamics has been proposed
to achieve SE. In particular, we remark that it requires
only 1-bit feedback messages between the corresponding
transmitter-receiver pairs. Nonetheless, its convergence
remains conditional. This suggests that the design
of algorithms such that at least one SE is learned in finite
time and in a fully distributed fashion remains an
open problem.

Appendix A 
Potential Games with Constrained Set of 
Actions 

In this appendix we present a generalization of a class
of games known as exact potential games (PG) [33].
We refer to this new class of games as constrained
exact potential games. First, consider a game in normal
form with constrained strategies and denote it by
$\tilde{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{u_k\}_{k \in \mathcal{K}}, \{g_k\}_{k \in \mathcal{K}})$. Let the set $\mathcal{F}_k \subseteq \mathcal{A}$
be the graph of the correspondence $g_k$, hence,

    $\mathcal{F}_k = \{(a_k, a_{-k}) \in \mathcal{A} : a_k \in g_k(a_{-k})\}$.    (46)

The set $\mathcal{F}_k$ determines the action profiles which can
be observed as outcomes of the game $\tilde{\mathcal{G}}$, when only
player $k$ is allowed to play given any action profile $a_{-k}$
for which the set $g_k(a_{-k})$ is not empty. Following this
reasoning, the set of all possible outcomes of the game
$\tilde{\mathcal{G}}$ corresponds to the following set

    $\mathcal{F} = \bigcap_{i=1}^{K} \mathcal{F}_i$,    (47)

which is the set of action profiles such that $\forall a \in \mathcal{F}$, it
holds that $\forall k \in \mathcal{K}$, $a_k \in g_k(a_{-k})$. Unilateral deviations
of a set of players from any action profile $a \in \mathcal{F}$ might
lead to action profiles which do not belong to $\mathcal{F}$. The
following set

    $\hat{\mathcal{F}} = \bigcup_{i=1}^{K} \mathcal{F}_i$

contains all possible unilateral deviations one can observe
from any action in the set $\mathcal{F}$. Using both sets $\mathcal{F}$
and $\hat{\mathcal{F}}$, we introduce the definition of exact constrained
potential game.

Definition 13 (Exact Constrained PG (ECPG)): Any
game in normal form with constrained set of actions
$\tilde{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{u_k\}_{k \in \mathcal{K}}, \{g_k\}_{k \in \mathcal{K}})$ is an exact
constrained potential game (ECPG) if there exists a
function $\phi : \hat{\mathcal{F}} \to \mathbb{R}$ such that for all $a \in \hat{\mathcal{F}}$, it holds
that, for all $k \in \mathcal{K}$ and for all $a'_k \in g_k(a_{-k})$,

    $u_k(a_k, a_{-k}) - u_k(a'_k, a_{-k}) = \phi(a_k, a_{-k}) - \phi(a'_k, a_{-k})$.

Before we continue, we clearly state that not all
the properties of potential games [33] hold for
constrained potential games. For instance, not all exact
constrained PG have an equilibrium. In the following,
we introduce two results regarding the existence of an
equilibrium in pure strategies in ECPG.

Theorem 14 (Existence of an equilibrium in ECPG):
The finite exact constrained potential game
$\tilde{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{u_k\}_{k \in \mathcal{K}}, \{g_k\}_{k \in \mathcal{K}})$, with potential
function $\phi : \mathcal{F} \to \mathbb{R}_+$, has at least one equilibrium in
pure strategies if the sets $\mathcal{F}_1, \ldots, \mathcal{F}_K$ are non-empty
and identical.

Proof: By assumption, the set $\mathcal{F}$ (47) is non-empty.
Thus, there exists at least one feasible outcome $a^* \in \mathcal{F}$
for the game $\tilde{\mathcal{G}}$. Now, for all $k \in \mathcal{K}$, any unilateral
deviation of player $k$ from an action profile $a^*$ leads to
an action profile of the form $(a_k, a_{-k}^*) \in \hat{\mathcal{F}}$. Similarly,
by assumption, both sets $\mathcal{F}$ and $\hat{\mathcal{F}}$ are identical, thus, any
unilateral deviation from a feasible action profile is also
a feasible action profile. Now, without any loss of generality,
let the elements of the set $\mathcal{F} = \{a^{(1)}, \ldots, a^{(N)}\}$
be indexed following any particular order such that the
following holds,

    $\phi(a^{(1)}) \leq \phi(a^{(2)}) \leq \ldots \leq \phi(a^{(N)})$,    (49)

with $N = |\mathcal{F}|$. Thus, from Def. 9 it holds that $a^{(N)}$
is an equilibrium of the game $\tilde{\mathcal{G}}$, which completes the
proof. ■

Appendix B 
Multiplicity of the ESE 

In this appendix, we study the multiplicity of the equilibrium
of the potential game with constrained strategies
$\tilde{\mathcal{G}} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{c_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$. In doing so,
we take advantage of the fact that it is a potential game
with constrained action sets and thus, we analyze the
auxiliary game $\tilde{\mathcal{G}}' = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{\phi\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$.
Note that this choice does not imply any loss of generality,
since the sets of GNE of both games are identical.
In the following, we use some tools from graph
theory to determine the number of ESE which
a given game in efficient-satisfaction form
$(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{c_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ can possess. We start
by indexing the elements of the action set $\mathcal{A}$ in any given
order using the index $n \in \mathcal{I} = \{1, \ldots, |\mathcal{A}|\}$. Denote by
$a^{(n)} = (a_1^{(n)}, \ldots, a_K^{(n)})$ the $n$-th element of the action
set $\mathcal{A}$. Consider that each action profile $a^{(n)}$ is associated
with a vertex $x_n$ in a given directed graph $G$. There
exists an arc from vertex $x_n$ to another vertex $x_m$ if
the action profile represented by the latter, $a^{(m)}$, can be
obtained from the former, $a^{(n)}$, by changing the action of
only one player and a lower potential (sum of efforts) is
obtained. For instance, if the unique deviator is player
$k$, then $a_k^{(m)} \in f_k(a_{-k}^{(n)})$ and $\phi(a^{(n)}) > \phi(a^{(m)})$.
More precisely, the graph $G$ can be defined by the pair
$G = (\mathcal{X}, B)$, where the set $\mathcal{X} = \{x_1, \ldots, x_{|\mathcal{A}|}\}$ (nodes)
contains the nodes representing the action profiles in the
set $\mathcal{A}$ and $B$ (edges) is a non-symmetric matrix with
dimensions $|\mathcal{A}| \times |\mathcal{A}|$ and entries defined as follows,
$\forall (n, m) \in \mathcal{I}^2$ and $n \neq m$,

    $b_{n,m} = 1$ if (i) $\exists! k \in \mathcal{K} : a_k^{(n)} \neq a_k^{(m)}$,
                     (ii) $a_k^{(m)} \in f_k(a_{-k}^{(n)})$, and
                     (iii) $\phi(a^{(m)}) < \phi(a^{(n)})$;
    $b_{n,m} = 0$ otherwise,    (50)

and $b_{i,i} = 0$ for all $i \in \mathcal{I}$.

A realistic assumption is to consider that for any pair
of action profiles $a^{(n)}$ and $a^{(m)}$ which are adjacent,
we have that $\phi(a^{(n)}) \neq \phi(a^{(m)})$. This is because
players assign different effort values to their actions.
From the definition of the matrix $B$, we have that a
necessary and sufficient condition for a vertex $x_n$ to
represent an ESE action profile is to have a null out-degree
in the oriented graph $G$, i.e., there are no outgoing
edges from the node $x_n$ (sink vertex). Finally, one can
conclude that determining the set of ESE in the game
$(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{c_k\}_{k \in \mathcal{K}}, \{f_k\}_{k \in \mathcal{K}})$ boils down to
identifying all the sink vertices in the oriented graph
$G$. That is, exploiting the fact that, if the $n$-th row of
the matrix $B$ is a null vector, then the action profile $a^{(n)}$ is
an ESE of the game. Interestingly, a particular case
arises when the resulting graph is an edgeless graph,
i.e., the corresponding matrix $B$ is a null matrix. In this
case, the set of SE would be identical to the set of ESE,
which implies that the idea of associating an effort to
each action is not enough to select an ESE among the
set of SE. In any case, determining the exact set of SE
would require the analysis of the matrix $B$, which might
be highly demanding and requires complete information.
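
The graph construction of this appendix can be sketched in a few lines: build the adjacency matrix following (50) and list the vertices with a null row. The example below is only illustrative. It assumes a hypothetical two-player game with three power levels, a toy correspondence in which a player is satisfied when its level is at least that of the other player, and a potential equal to the sum of the levels.

```python
from itertools import product

LEVELS = (0, 1, 2)  # hypothetical actions, also used as effort values
profiles = list(product(LEVELS, repeat=2))


def f(k, a_other):
    """Hypothetical correspondence: satisfied when own level >= other's level."""
    return {a for a in LEVELS if a >= a_other}


def phi(profile):
    """Potential taken as the sum of efforts."""
    return sum(profile)


def arc(src, dst):
    """Entry of B following (50): 1 if exactly one player deviates, the new
    action is feasible given the others, and the potential strictly decreases."""
    deviators = [k for k in range(2) if src[k] != dst[k]]
    if len(deviators) != 1:
        return 0
    k = deviators[0]
    feasible = dst[k] in f(k, src[1 - k])
    return int(feasible and phi(dst) < phi(src))


B = [[arc(src, dst) for dst in profiles] for src in profiles]
# Sink vertices: profiles whose row in B is a null vector.
sinks = [p for p, row in zip(profiles, B) if not any(row)]
print("sink vertices:", sinks)
```

In this toy game the sinks are the three profiles in which both players use the same level, so the effort criterion alone does not single out a unique profile here.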